AITopics | cluster shape

Flexible Bivariate Beta Mixture Model: A Probabilistic Approach for Clustering Complex Data Structures

Hsu, Yung-Peng, Chen, Hung-Hsuan

arXiv.org Artificial Intelligence, Feb-27-2025

Clustering, an unsupervised learning method, is widely used in various applications, including image analysis, information retrieval, text analysis, bioinformatics, and many more [1, 2, 3, 4]. Clustering helps uncover the underlying structure of the data, facilitates data summarization, and sometimes serves as a preprocessing step for other algorithms [2]. Despite its widespread use, a primary challenge for many traditional clustering algorithms is that they assume the data points form clusters with convex shapes. For example, centroid-based algorithms like k-means and distribution-based models like Gaussian Mixture Models (GMM) typically produce clusters that are hyperspherical or ellipsoidal [5]. Although this assumption simplifies the clustering process, it limits how well these models handle complex data distributions that do not conform to convex shapes.
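The convexity limitation can be illustrated with a small sketch. It is not taken from the paper: the helper names `ring` and `kmeans` and the two-ring dataset are assumptions made for illustration. A plain Lloyd's-style k-means is run on two concentric rings; because each point goes to its nearest centroid, the two clusters are separated by a straight line, so the outer ring cannot be recovered as one cluster.

```python
import math

def ring(radius, n):
    """Return n evenly spaced points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]

def kmeans(points, centroids, iters=50):
    """Plain Lloyd's algorithm (illustrative); returns one label per point."""
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda k: (p[0] - centroids[k][0]) ** 2
                                  + (p[1] - centroids[k][1]) ** 2)
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append((sum(p[0] for p in members) / len(members),
                                      sum(p[1] for p in members) / len(members)))
            else:
                # Keep the old centroid if a cluster lost all its members.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels

# Two concentric rings: a non-convex structure (hypothetical toy data).
inner, outer = ring(1.0, 40), ring(5.0, 40)
points = inner + outer
labels = kmeans(points, [inner[0], outer[0]])
outer_labels = set(labels[len(inner):])
print(outer_labels)  # labels that occur on the outer ring
```

In this toy setup the outer ring is split across both clusters, which is the behavior the passage attributes to models that assume convex cluster shapes.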

beta distribution, bivariate beta distribution, dataset, (14 more...)

arXiv.org Artificial Intelligence

2502.19938

Country:

Asia > Taiwan (0.04)
North America > United States > California > Alameda County > Oakland (0.04)
Europe > Italy (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Multivariate Beta Mixture Model: Probabilistic Clustering With Flexible Cluster Shapes

Hsu, Yung-Peng, Chen, Hung-Hsuan

arXiv.org Artificial Intelligence, Jan-29-2024

Data clustering groups data points into components so that similar points are within the same component. Data clustering is commonly used for data exploration and is sometimes used as a preprocessing step for later analysis [1]. In this paper, the multivariate beta mixture model (MBMM), a new probabilistic model for soft clustering, is proposed. As the MBMM is a mixture model, it shares many properties with the Gaussian mixture model (GMM), including its soft cluster assignment and parametric modeling. In addition, the MBMM allows the generation of new (synthetic) instances based on a generative process. Because the beta distribution is highly flexible (e.g., unimodal, bimodal, straight line, or exponentially increasing or decreasing), MBMM can fit data with versatile shapes.
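The shape flexibility and soft assignment described above can be sketched in a few lines. This is a univariate illustration only, not the multivariate beta model from the paper; the function names `beta_pdf` and `responsibilities` and all parameter values are assumptions chosen for the example.

```python
import math

def beta_pdf(x, a, b):
    """Density of the univariate Beta(a, b) distribution for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def responsibilities(x, weights, params):
    """Soft assignment: posterior probability of each mixture component given x."""
    joint = [w * beta_pdf(x, a, b) for w, (a, b) in zip(weights, params)]
    total = sum(joint)
    return [j / total for j in joint]

# A few (a, b) settings that give qualitatively different density shapes.
shapes = {
    "unimodal": (2, 2),
    "U-shaped (bimodal)": (0.5, 0.5),
    "straight line": (1, 2),
    "increasing": (5, 1),
}
for name, (a, b) in shapes.items():
    print(name, [round(beta_pdf(x, a, b), 3) for x in (0.1, 0.5, 0.9)])

# Soft cluster assignment for one point under a two-component mixture.
resp = responsibilities(0.2, [0.5, 0.5], [(2, 5), (5, 2)])
print(resp)
```

The responsibilities sum to one, so each point receives a probability of belonging to every component rather than a hard label, as in a GMM.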

algorithm, beta distribution, multivariate beta distribution, (14 more...)

arXiv.org Artificial Intelligence

2401.16708

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
North America > United States > Wisconsin (0.04)
Asia > Taiwan (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

ShaRP: Shape-Regularized Multidimensional Projections

Machado, Alister, Telea, Alexandru, Behrisch, Michael

arXiv.org Artificial Intelligence, Jun-1-2023

Projections, or dimensionality reduction methods, are techniques of choice for the visual exploration of high-dimensional data. Many such techniques exist, each one of them having a distinct visual signature - i.e., a recognizable way to arrange points in the resulting scatterplot. Such signatures are implicit consequences of algorithm design, such as whether the method focuses on local vs global data pattern preservation; optimization techniques; and hyperparameter settings. We present a novel projection technique - ShaRP - that provides users explicit control over the visual signature of the created scatterplot, which can cater better to interactive visualization scenarios. ShaRP scales well with dimensionality and dataset size, generically handles any quantitative dataset, and provides this extended functionality of controlling projection shapes at a small, user-controllable cost in terms of quality metrics.

data mining, machine learning, projection, (15 more...)

arXiv.org Artificial Intelligence

2306.00554

Country:

Europe > Netherlands (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.90)

repliclust: Synthetic Data for Cluster Analysis

Zellinger, Michael J., Bühlmann, Peter

arXiv.org Artificial Intelligence, Mar-24-2023

Our approach is based on data set archetypes, high-level geometric descriptions from which the user can create many different data sets, each possessing the desired geometric characteristics. The architecture of our software is modular and object-oriented, decomposing data generation into algorithms for placing cluster centers, sampling cluster shapes, selecting the number of data points for each cluster, and assigning probability distributions to clusters.
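The modular decomposition described above might look roughly like the following sketch. It does not use or reproduce the repliclust API: every function name, parameter, and default here is hypothetical, and the probability distribution is fixed to an axis-aligned Gaussian for brevity.

```python
import random

def place_centers(k, dim, spread, rng):
    """Stage 1 (hypothetical): place k cluster centers uniformly in a box."""
    return [[rng.uniform(-spread, spread) for _ in range(dim)] for _ in range(k)]

def sample_shapes(k, dim, rng):
    """Stage 2 (hypothetical): per-cluster axis lengths (one std. dev. per coordinate)."""
    return [[rng.uniform(0.5, 2.0) for _ in range(dim)] for _ in range(k)]

def choose_sizes(k, n_total):
    """Stage 3 (hypothetical): split n_total points as evenly as possible."""
    base, extra = divmod(n_total, k)
    return [base + (1 if i < extra else 0) for i in range(k)]

def generate(k=3, dim=2, n_total=100, spread=10.0, seed=0):
    """Stage 4: draw points; every cluster is Gaussian in this sketch."""
    rng = random.Random(seed)
    centers = place_centers(k, dim, spread, rng)
    shapes = sample_shapes(k, dim, rng)
    sizes = choose_sizes(k, n_total)
    X, y = [], []
    for label, (center, shape, n) in enumerate(zip(centers, shapes, sizes)):
        for _ in range(n):
            X.append([rng.gauss(m, sd) for m, sd in zip(center, shape)])
            y.append(label)
    return X, y

X, y = generate()
print(len(X), sorted(set(y)))
```

Keeping each stage in its own function means one stage (for example, the size selection) can be swapped out without touching the others, which is the point of a modular design.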

artificial intelligence, machine learning, overlap, (17 more...)

arXiv.org Artificial Intelligence

2303.14301

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > Alberta (0.14)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

K-expectiles clustering

Wang, Bingling, Li, Yinxing, Härdle, Wolfgang Karl

arXiv.org Machine Learning, Mar-16-2021

$K$-means clustering is one of the most widely used partitioning algorithms in cluster analysis due to its simplicity and computational efficiency. However, $K$-means does not provide an appropriate clustering result when applied to data with non-spherically shaped clusters. We propose a novel partitioning clustering algorithm based on expectiles. The cluster centers are defined as multivariate expectiles, and clusters are searched via a greedy algorithm by minimizing the within-cluster '$\tau$-variance'. We suggest two schemes: fixed $\tau$ clustering and adaptive $\tau$ clustering. Validated by simulation results, this method beats both $K$-means and spectral clustering on data with asymmetrically shaped clusters, or clusters with a complicated structure, including asymmetric normal, beta, skewed $t$ and $F$ distributed clusters. Applications of adaptive $\tau$ clustering to crypto-currency (CC) market data are provided. The expectile clusters of CC markets show signs of a market dominated by institutional investors. The second application is image segmentation. Compared to other center-based clustering methods, the adaptive $\tau$ cluster centers of pixel data can better capture and describe the features of an image. Fixed $\tau$ clustering offers more flexibility in segmentation with decent accuracy.
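For readers unfamiliar with expectiles, the following univariate sketch may help. It is not the authors' algorithm: the function names are hypothetical, and it shows only the standard definition of a $\tau$-expectile as the minimizer of an asymmetrically weighted squared loss, computed by iterating a weighted mean. With $\tau = 0.5$ the expectile is the ordinary mean, which is the center $K$-means uses.

```python
def asymmetric_sq_loss(xs, e, tau):
    """Squared loss with weight tau above e and (1 - tau) at or below e."""
    return sum((tau if x > e else 1 - tau) * (x - e) ** 2 for x in xs)

def expectile(xs, tau, tol=1e-12, max_iter=1000):
    """tau-expectile of a sample via iteratively reweighted means."""
    e = sum(xs) / len(xs)  # start from the ordinary mean
    for _ in range(max_iter):
        weights = [tau if x > e else 1 - tau for x in xs]
        new_e = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(new_e - e) < tol:
            return new_e
        e = new_e
    return e

sample = [1, 2, 3, 10]  # right-skewed toy data
mean_like = expectile(sample, 0.5)  # equals the sample mean
upper = expectile(sample, 0.9)      # pulled toward the long right tail
lower = expectile(sample, 0.1)      # pulled toward the bulk on the left
print(mean_like, upper, lower)
```

Choosing $\tau$ away from 0.5 moves the center toward one tail, which is how an expectile-based center can adapt to asymmetric clusters where a mean cannot.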

algorithm, k-expectile, k-means, (14 more...)

arXiv.org Machine Learning

2103.09329

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Berlin (0.04)
Asia > Taiwan (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Beyond 4D Tracking: Using Cluster Shapes for Track Seeding

Fox, Patrick J., Huang, Shangqing, Isaacson, Joshua, Ju, Xiangyang, Nachman, Benjamin

arXiv.org Machine Learning, Dec-8-2020

Analyzing data from the Large Hadron Collider (LHC) presents a hypervariate challenge. A given collision event may result in hundreds of outgoing particles, each with many features (momentum, electric charge, etc.). This hypervariate phase space is then observed by complex multi-channel detectors that are essentially hyperspectral cameras. The LHC detectors have millions of readout channels, and dimensionality reduction is essential for data analysis. One natural and nearly lossless reduction is the reconstruction of charged particle trajectories ('tracks'). The innermost layers of the detectors at the LHC are constructed to register the passage of charged particles without significantly altering the particle energy or direction. In the ATLAS and CMS detectors, this is achieved using silicon sensors that are finely segmented in one or two directions and are called strips and pixels, respectively. We will focus on pixels, although our methodology applies more generally. Typically, the first step in a tracking algorithm is the construction of seeds, which are sets of three or more hit pixel clusters that can be used to fit charged-particle trajectories (see e.g.

cluster shape, information, neural network, (16 more...)

arXiv.org Machine Learning

2012.04533

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Africa > Zambia > Southern Province > Choma (0.04)
North America > United States > Illinois > Kane County > Batavia (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
Information Technology > Data Science (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Principal Ellipsoid Analysis (PEA): Efficient non-linear dimension reduction clustering

#artificialintelligence, Aug-19-2020, 13:56:18 GMT

Even with the rise in popularity of over-parameterized models, simple dimensionality reduction and clustering methods, such as PCA and k-means, are still routinely used in an amazing variety of settings. A primary reason is the combination of simplicity, interpretability and computational efficiency. The focus of this article is on improving upon PCA and k-means by allowing non-linear relations in the data and more flexible cluster shapes, without sacrificing the key advantages. The key contribution is a new framework for Principal Ellipsoid Analysis (PEA), defining a simple and computationally efficient alternative to PCA that fits the best elliptical approximation through the data. We provide theoretical guarantees on the proposed PEA algorithm using Vapnik-Chervonenkis (VC) theory to show strong consistency and uniform concentration bounds.

artificial intelligence, efficient non-linear dimension reduction, machine learning, (4 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.44)